Enable applying multiple Configurations that use the same Discovery Handler #432

Merged

kate-goldenring
Contributor

@kate-goldenring kate-goldenring commented Nov 25, 2021

Signed-off-by: Kate Goldenring [email protected]

What this PR does / why we need it:
closes #431
It should be possible to apply multiple Configurations that use the same Discovery Handler, so that different types of brokers can be deployed to devices and different sets of devices can be discovered. For example, you may want to deploy two Configurations that both use the udev Discovery Handler: one that discovers USB cameras and deploys a broker that grabs frames from each camera, and another that discovers GPUs and deploys inferencing pods.
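
To make the scenario concrete, here is a minimal sketch of two Configurations that share one Discovery Handler. The `Configuration` struct, its field names, the broker image names, and the GPU udev rule are simplified, hypothetical stand-ins for illustration only; they are not Akri's actual types or recommended rules.

```rust
use std::collections::BTreeMap;

/// Simplified, hypothetical stand-in for an Akri Configuration.
#[derive(Debug, Clone, PartialEq)]
pub struct Configuration {
    pub name: String,
    /// Name of the Discovery Handler this Configuration asks for.
    pub discovery_handler: String,
    /// Filter passed to the handler (illustrative udev rule).
    pub discovery_details: String,
    /// Workload deployed to each discovered device (made-up image name).
    pub broker_image: String,
}

/// Groups Configuration names by the Discovery Handler they reference.
pub fn group_by_handler(configs: &[Configuration]) -> BTreeMap<String, Vec<String>> {
    let mut groups: BTreeMap<String, Vec<String>> = BTreeMap::new();
    for config in configs {
        groups
            .entry(config.discovery_handler.clone())
            .or_default()
            .push(config.name.clone());
    }
    groups
}

/// Builds the two example Configurations: USB cameras and GPUs, both via udev.
pub fn example_configurations() -> Vec<Configuration> {
    vec![
        Configuration {
            name: "usb-cameras".to_string(),
            discovery_handler: "udev".to_string(),
            discovery_details: r#"KERNEL=="video[0-9]*""#.to_string(),
            broker_image: "example/frame-grabber".to_string(),
        },
        Configuration {
            name: "gpus".to_string(),
            discovery_handler: "udev".to_string(),
            // Assumed rule for illustration; a real GPU rule may differ.
            discovery_details: r#"SUBSYSTEM=="drm""#.to_string(),
            broker_image: "example/inference".to_string(),
        },
    ]
}
```

Grouping the example Configurations yields a single `udev` entry that lists both of them, which is the situation this PR is meant to allow.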

Special notes for your reviewer:
Removes the concept of DiscoveryHandlerStatus, which seemed unnecessary.
Removes tests in config_action.rs that were failing because they referenced the MockDiscoveryOperator improperly. The DiscoveryOperator should be reworked into a trait structure so that it can be mocked more easily.
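
As a rough sketch of what a trait-based structure could look like, the code below hides one operation behind a trait so that tests can substitute a hand-written double. The names `DiscoveryOperations`, `MockOperator`, and `count_devices` are hypothetical and do not reflect Akri's actual DiscoveryOperator API.

```rust
/// Hypothetical trait capturing the one operation the caller needs.
pub trait DiscoveryOperations {
    /// Returns the names of the devices currently visible to the handler.
    fn discover(&self) -> Result<Vec<String>, String>;
}

/// Test double that returns a canned result instead of contacting a handler.
pub struct MockOperator {
    pub devices: Result<Vec<String>, String>,
}

impl DiscoveryOperations for MockOperator {
    fn discover(&self) -> Result<Vec<String>, String> {
        self.devices.clone()
    }
}

/// Code under test depends only on the trait, so a real operator and a
/// mock are interchangeable. A failed discovery is counted as zero devices.
pub fn count_devices(operator: &dyn DiscoveryOperations) -> usize {
    operator.discover().map(|devices| devices.len()).unwrap_or(0)
}
```

Because `count_devices` accepts any `DiscoveryOperations` implementation, a test can cover both the success and failure paths without a running Discovery Handler.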

If applicable:

  • this PR has an associated PR with documentation in akri-docs
  • this PR contains unit tests
  • added code adheres to standard Rust formatting (cargo fmt)
  • code builds properly (cargo build)
  • code is free of common mistakes (cargo clippy)
  • all Akri tests succeed (cargo test)
  • inline documentation builds (cargo doc)
  • version has been updated appropriately (./version.sh)
  • all commits pass the DCO bot check by being signed off -- see the failing DCO check for instructions on how to retroactively sign commits

@kate-goldenring
Contributor Author

@romoh @bfjelds can you take a look at this? This is a good bug to fix.

@@ -674,25 +604,23 @@ pub mod start_discovery {
endpoint
);
// Only use DiscoveryHandler if it doesn't have a client yet
Collaborator

seems like this comment should change

Contributor Author

good point thanks!

@bfjelds
Collaborator

bfjelds commented Dec 2, 2021

i don't remember exactly why we had a filter for endpoint=Active.

is it possible that what we really wanted was (endpoint+configurationName)=Active?

@kate-goldenring
Contributor Author

> i don't remember exactly why we had a filter for endpoint=Active.
>
> is it possible that what we really wanted was (endpoint+configurationName)=Active?

Sorry @bfjelds, I forgot to hit enter on the reply. I think it was to prevent multiple Configurations from using the same Discovery Handler. I don't know why I thought that was something to prevent when we definitely want it to be enabled. I don't see a reason for tracking state, as we can tell when we lose connection with the Discovery Handler.
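
The difference between the two filters discussed here can be sketched as follows. This is only an illustration of the question, not Akri's code: both tracker types and the endpoint and Configuration names are hypothetical, and the PR description states that status tracking is removed rather than re-keyed.

```rust
use std::collections::HashSet;

/// Illustrative endpoint-only filter: one "active" flag per endpoint.
#[derive(Default)]
pub struct EndpointOnlyTracker {
    active: HashSet<String>,
}

impl EndpointOnlyTracker {
    /// Returns true if discovery may start, false if the endpoint is
    /// already marked active (regardless of which Configuration asks).
    pub fn try_start(&mut self, endpoint: &str, _configuration: &str) -> bool {
        self.active.insert(endpoint.to_string())
    }
}

/// Illustrative filter keyed on the (endpoint, Configuration name) pair.
#[derive(Default)]
pub struct PerConfigurationTracker {
    active: HashSet<(String, String)>,
}

impl PerConfigurationTracker {
    /// Returns true unless this exact pair is already marked active.
    pub fn try_start(&mut self, endpoint: &str, configuration: &str) -> bool {
        self.active
            .insert((endpoint.to_string(), configuration.to_string()))
    }
}
```

With the endpoint-only tracker, a second Configuration that targets the same endpoint is rejected; with the pair-keyed tracker, it is accepted and only an exact repeat is rejected.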

}
} else {
return Err(e);
Contributor

I'm trying to see if this is a case we want to handle. Can status be empty? Maybe log an error at least?

Contributor Author

This error is hit when the error is not a tonic::Status, which signals that it is not a connection error with the Discovery Handler but rather that one of the Discovery Operator's other functions, such as update_instance_connectivity_status, errored. This is an error we'd want to bubble up, while connection errors can be retried in case the Discovery Handler comes back online.
Do you think an error! log should be emitted before the bubble up?
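
The pattern described in this reply can be sketched with the standard library alone. `ConnectionStatus` is a hypothetical stand-in for `tonic::Status`, and `Action` and `classify` are made-up names; the sketch shows only the downcast-then-branch idea, not the actual Akri implementation.

```rust
use std::error::Error;
use std::fmt;

/// Hypothetical stand-in for `tonic::Status` (a connection-level error).
#[derive(Debug)]
pub struct ConnectionStatus(pub String);

impl fmt::Display for ConnectionStatus {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "connection error: {}", self.0)
    }
}

impl Error for ConnectionStatus {}

/// What the caller should do with an error from the discovery loop.
#[derive(Debug, PartialEq)]
pub enum Action {
    /// The handler may come back online, so try to reconnect.
    Retry,
    /// Some other operation failed; propagate the message to the caller.
    BubbleUp(String),
}

/// Retries only when the error downcasts to the connection error type;
/// every other error is bubbled up.
pub fn classify(e: Box<dyn Error + Send + Sync + 'static>) -> Action {
    if e.downcast_ref::<ConnectionStatus>().is_some() {
        Action::Retry
    } else {
        Action::BubbleUp(e.to_string())
    }
}
```

An `error!` log, if one is wanted, would fit naturally in the `else` branch just before the error is returned.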

Contributor

Logging it might help from a debuggability perspective. I was mainly curious how we want to handle that case and whether we should treat it like a connection error.

Contributor

@romoh romoh left a comment

LGTM.
nit - we might want to document the design in code to prevent future regressions.

@kate-goldenring kate-goldenring merged commit d4ffea7 into project-akri:main Dec 14, 2021
@kate-goldenring kate-goldenring deleted the multiple-configs-per-dh branch December 14, 2021 16:26
Development

Successfully merging this pull request may close these issues.

Cannot apply multiple Configurations that use the same DiscoveryHandler
3 participants